-
-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issues with race conditions connecting to Vault #65
Conversation
This looks like a good approach, I'll see if I can quickly test it and merge this week. Can you rebase as well? Just merged a fix to ENV setup |
b2a82ea
to
5e08ebc
Compare
For sure - rebased on 4376028. (I've kept it as two commits - one for the critical section, one for the variable rename to avoid environment escape - but shout if you'd prefer it as one.) |
…versions of this function and this one in different environments
5e08ebc
to
ebdbf8f
Compare
@rmc47 So sorry, I completely forgot about this PR. I've just merged in some logic for comma-based pathing that has a bunch of refacotring, are you ok to rebase again? If not no worries, I think I get the jist of the logic here so I think I can do it. |
- Building on the original work in #65
Ok, I think I've rebased it correctly, gonna see if I can manually check this in an acceptance test environment 👍🏻 |
Amazing, thank you! To be honest, we'd entirely forgotten about it as well and have been running off my branch for the last couple of months 😆. Would be great to get back on to main if you're happy with the fix! |
Glad to hear! Are you ok to test this particular rebased branch as well to confirm it works? If so, I'll merge and cut a new 2.0.0 version of the module as a potential breaking change, then you can move back to the master branch 👍🏻 |
@petems that's been running for the last few days, and all looks good our side! |
Awesome, lets get this merged and a new release cut! 😄 |
Related to #61 and possibly hashicorp/vault-ruby#216, we've been seeing a large number of
Vault::ConnectionPool::PoolShuttingDownError
errors running hiera_vault on a Puppet server.There looks like there's two (related) possible race conditions in play: while a single Puppet JRuby instance serves one request at a time, I think the background thread that the Debouncer calls shutdown on can happen concurrently with the main thread's use of the Vault client.
I think this flow is possible:
I think there's also an issue where $vault gets set to nil following a StandardError, but not recreated when it's used on a subsequent request?
This addresses both scenarios by ensuring all use of $vault happens within a critical section. This should not substantially affect performance: the mutex should always be uncontended except where the shutdown and vault_get calls happen at the same time. It also ensures (within the critical section) that the vault client is not nil, and creates a new one if required.
GitHub's diff is pretty unhelpful due to indentation changes, I'm afraid - ignoring whitespace changes gives something much more readable. I've changed the names of the global variables to avoid any conflicts with other global variables from older versions of this module running concurrently in other Puppet environments, as I think Ruby globals are not isolated between environments.
(Caveat: my Ruby knowledge is pretty poor - apologies if I've taken the wrong approach on this.)