Feature Request: Add support for StalledDiskPrimary recoveries by vtorc #16375

Open
joekelley opened this issue Jul 12, 2024 · 0 comments · May be fixed by #17470
Labels
Component: VTorc Vitess Orchestrator integration Type: Feature

Comments

@joekelley

Feature Description

At HubSpot, we have had a handful of incidents in which a primary became impaired due to disk issues. When this happens, vtorc assigns an UnreachablePrimary analysis, which leads to a no-op recovery because the FullStatus call it makes to the tablet times out. We monitor for these cases outside of Vitess and resolve them by running ERS (EmergencyReparentShard), but it would be ideal if vtorc could detect and address these cases itself.

Use Case(s)

This recovery would be useful when:

  • a hardware or infrastructure fault makes the disk unavailable.
  • a primary enters a sustained period of severe I/O latency, where a failover is preferable to waiting for the underlying issue to be resolved.
@joekelley joekelley added Needs Triage This issue needs to be correctly labelled and triaged Type: Feature labels Jul 12, 2024
@mattlord mattlord added Component: VTorc Vitess Orchestrator integration and removed Needs Triage This issue needs to be correctly labelled and triaged labels Jul 15, 2024
@GuptaManan100 GuptaManan100 linked a pull request Jan 7, 2025 that will close this issue
2 participants