Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't allow setting dead server last contact threshold to less than 1 minute #22040

Merged
merged 6 commits into from
Jul 25, 2023

Conversation

raskchanky
Copy link
Contributor

This is a frequent source of problems when people tune their autopilot configs. I took a quick pass over the other autopilot config values and didn't see any that screamed for validation quite as much as this one, but if there are others that people think should be validated for some valid range of values, I'm happy to add those.

@github-actions github-actions bot added the hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed label Jul 24, 2023
@github-actions
Copy link

github-actions bot commented Jul 24, 2023

CI Results:
All Go tests succeeded! ✅

@github-actions
Copy link

Build Results:
All builds succeeded! ✅

Copy link
Collaborator

@ncabatoff ncabatoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a docs update with this to document the minimum?

@mpalmi
Copy link
Contributor

mpalmi commented Jul 25, 2023

This seems like a candidate for a changelog as well.

@raskchanky
Copy link
Contributor Author

I will add both.

@raskchanky raskchanky requested a review from a team as a code owner July 25, 2023 16:02
@msdundar
Copy link

msdundar commented Aug 30, 2023

@raskchanky I think in some pieces of official Vault documentation 10 seconds was in place, and I assume people simply adopted that. I believe a best-practices document or some kind of a guideline about autopilot flags could've been very useful as the difference between 1 minute and 24 hours (the default value) for dead server last contact is still too big!

@raskchanky
Copy link
Contributor Author

@msdundar I think you're probably right re: 10 seconds being somewhere in the official docs. I'll see if I can track it down. We designed the autopilot defaults to be appropriate for most use cases, so most of the time, tuning isn't necessary. I can see now that we don't mention that in our docs, however, and in fact, the language used could be easily interpreted to suggest that people should tune these. I appreciate you calling out these inconsistencies in our docs. I'll do my best to get them updated.

@msdundar
Copy link

msdundar commented Aug 30, 2023

If I'm not mistaken, it was this tutorial, and 10s was in the example that is right under the Note section. However, the example seems to be updated to 1m as well:

vault operator raft autopilot set-config \
    -dead-server-last-contact-threshold=1m \
    -server-stabilization-time=30 \
    -cleanup-dead-servers=true \
    -min-quorum=3

Many thanks for your quick reply and thanks for taking care of the minimum value! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants