Databases/MongoDB - error 500 on creds generation when mongodb replicaset primary step-down #2973
Comments
Do you see this problem using the combined database backend? https://www.vaultproject.io/docs/secrets/databases/mongodb.html The existing one is deprecated at this point. If this is a new setup I'd recommend using that instead. |
Yes. I use the beta backend at https://www.vaultproject.io/docs/secrets/databases/mongodb.html and I observe the problem with it, not with the deprecated one. Sorry for the misleading mount point name. In this ticket the name is a fake one for public use, and it has the format RSNAME_mongodb. I currently mount only one combined database backend. |
@drakeu when you got that EOF Vault probably printed out a panic into its server log (which goes to stdout). Can you supply Vault's server logs from when that's run? |
These are the logs from when I tried to release 2 instances of a service (Java Spring with Spring Cloud Vault configuration) that use the databases/mongodb backend. The log has only two entries for this step:
After a few minutes I decided to step down the current master:
After that when I run
But there are no results in the logs for that operation. Next I stepped down the current Vault master and it went back to normal. |
@drakeu Thanks, that helps clarify whether the EOF is coming from the Mongo connection itself (it is) or from Vault due to a panic (it's not). Any chance you would be able to build a branch? I could push in a change to try to work around this. |
I sent you an invitation. I'm not a Go developer, so I can only help with testing. |
Hi @drakeu , Please try out the |
Hi. It looks like it is still not working. What I did:
|
Edit: Nevermind, that looks like it's a valid format. |
I just pushed up another change, please try again (and importantly, be sure to again include logs if it doesn't work). |
I'm sorry, I have exactly the same situation as before. I removed the repo directory, cloned it again, checked out the branch (your commit: c8886bd from Tue Jul 18 19:20:10 2017 -0400), rebuilt the project (
Logs from Vault below:
|
From the mongoDB docs:
You might have to restart the plugin to reset the connection: https://www.vaultproject.io/api/secret/databases/index.html#reset-connection |
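For readers who want to script the reset suggested above, here is a minimal Go sketch that builds the reset-connection request. It assumes the database backend is mounted at `database` and the connection is named `mongodb`; the address, mount, connection name, and token below are placeholders, not values from this thread. The helper `buildResetRequest` is hypothetical and only prepares the request, it does not send it.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// buildResetRequest prepares a POST to <addr>/v1/<mount>/reset/<name>.
// The mount path and connection name are assumptions supplied by the caller.
func buildResetRequest(vaultAddr, mount, connName, token string) (*http.Request, error) {
	endpoint := fmt.Sprintf("%s/v1/%s/reset/%s",
		strings.TrimRight(vaultAddr, "/"),
		strings.Trim(mount, "/"),
		url.PathEscape(connName))

	req, err := http.NewRequest(http.MethodPost, endpoint, nil)
	if err != nil {
		return nil, err
	}
	// Vault reads the client token from this header.
	req.Header.Set("X-Vault-Token", token)
	return req, nil
}

func main() {
	// Placeholder values for illustration only.
	req, err := buildResetRequest("http://127.0.0.1:8200", "database", "mongodb", "example-token")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```

Sending the request with `http.DefaultClient.Do(req)` and checking for a 2xx status would complete the call; that part is left out so the sketch runs without a Vault server.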
Thank you for this solution. As I wrote before, the 500 error occurs when the primary steps down. In my test case I use rs.stepDown() only to simulate this situation, but I previously hit it when a 3-node production replica set changed its primary automatically. You can easily simulate this by killing the primary node in the previous example or, if you run the MongoDB replica set in Docker containers, by pausing the container with the primary. In a production environment a MongoDB replica set can change its primary automatically in many situations (network partitions, VM problems, hardware problems, etc.), and also during rolling maintenance (for example a MongoDB version upgrade). I don't think it is expected production-ready behaviour for Vault to have problems connecting to the new primary, especially when it knows the other members.
I killed the current primary:
Vault does not reconnect, same as before:
|
Did you try resetting/reloading the backend? |
Yes. The connection reset endpoint works when I trigger it with curl:
Logs:
Can this behaviour be automatic? I mean, could Vault reset the connection when it recognizes that the current primary is now a secondary? |
That's what my branch is trying to do. It's not clear to me why it's not working but I haven't had a chance to circle back around to it. |
Any progress? |
No, I don't know of a way to check the connection state, and for some reason attempting to see and take action on the error isn't working. |
We ran into this issue after migrating from the deprecated MongoDB backend to the database backend. Never experienced it with the old backend. Unfortunately, 0.8.3 doesn't fix it for us. After a change of the primary, any attempt to read database credentials from vault will block until it times out. |
@jeinwag That was a very useful comment as it pointed us away from the EOF being the issue and back towards a difference between the old and new backends. I think I have a fix, which will go into 0.9.1 -- please try that when it's out and let us know. |
This was in the deprecated backend where it fixed a similar issue a long time ago but for some reason didn't make it over. Additionally the function wasn't being locked properly. Hopefully fixes #2973
OK, I just tested this with 0.9.1: I triggered a step down of the primary on the MongoDB cluster and tried reading credentials from the MongoDB backend. At first I got an EOF: But an immediate second attempt was successful. |
There seems to be another issue related to this. After the MongoDB primary has changed, Vault won't shut down properly:
Then nothing after "stopping rollback manager". |
I think I’ve found the potential issue, and will get back to you once I get a branch with the fix pushed. |
Just gave 0.9.3 a shot, looking good now! Thanks! |
On the newest version of Mongo (4.2), vault will not return an EOF error but will instead return "not master". You've done some great work on fixing this issue (@jefferai and @calvn) and it would be awesome if you could add "not master" to the switch case statement. That will make the plugin work with 4.2 as well.
to
should fix it for mongo 4+! |
I have the following environment setup:
3-node MongoDB replica set
3-node Vault (v0.7.3) cluster with storage on a 3-server Consul (0.8.3) cluster
Configured databases/mongodb secret backend.
My anonymised connection_url:
connection_url:mongodb://USER:PASSWORD@x.x.x.x:27017,y.y.y.y:27017,z.z.z.z:27017/admin?replicaSet=REPLICASETNAME
Problem:
After configuration it works perfectly. But when MongoDB steps down the primary and elects a new one, I receive the following error when I try to generate credentials:
I temporarily fix the situation by running vault step-down. After that Vault starts generating credentials again.