Need better story for multiple datacenter scenarios #633
Comments
Even when that lands, I think you're better off using the physical store's built-in capabilities (whatever they are). Not everything will be able to be enumerated, and even if so, you won't catch e.g. the master keys this way.
This would be very useful. Both (1) and (2) would be an issue for us as well. Something similar to High Availability, which some backends currently support, would be great. Consul + replication and some sugar to support this in Vault would be ideal.
Our plan is to run a separate Vault cluster (on the same nodes that run the Consul masters for that DC) in each datacenter, since in our environment each datacenter is in its own availability zone/region/continent and may not have very good connectivity back to our core. Then we'd have a "master vault" somewhere secure which holds the "master copy" of all secrets. This vault cluster would usually be sealed. To deploy new secrets or update existing ones in a datacenter, we would unseal this "master vault", pull the secrets and all configuration needed for the destination datacenter, push it to the destination vault, then reseal the "master vault" again. This "master vault" would also be the only place where we store the root token for each destination datacenter.

If it were possible to list everything inside Vault we could easily retrieve / list all secrets and policies for any vault, quite easily script deploys / copying of (selected) secrets to new datacenters, and keep track of / delete what shouldn't be there anymore.

So, the procedure would be something like this:

- Unseal the master vault (these keys would be stored in a safe, as well as one key per security person; to unseal without access to the safe we just need enough people who agree on it being unsealed)
- Init a new "slave vault" -> store its keys in the master vault
- Read /slave-vault-dc-name -> policies for the slave vault (list of secrets, policies, app-ids and such to deploy? some datacenters only have a few services, others need more)
- Read / list all secrets / policies currently in the slave vault -> store in the script as "current"
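A rough sketch of what that copy step could look like with Vault's Go API client, assuming the requested list capability exists for the generic backend; the addresses, token environment variables, and the `secret/slave-vault-dc-name/` prefix are illustrative placeholders, not anything that exists today:

```go
package main

import (
	"fmt"
	"log"
	"os"

	vault "github.com/hashicorp/vault/api"
)

// newClient builds a Vault API client for the given address and token.
func newClient(addr, token string) (*vault.Client, error) {
	cfg := vault.DefaultConfig()
	cfg.Address = addr
	c, err := vault.NewClient(cfg)
	if err != nil {
		return nil, err
	}
	c.SetToken(token)
	return c, nil
}

func main() {
	// Hypothetical addresses and tokens; the master vault would only be
	// unsealed for the duration of this run.
	master, err := newClient("https://master-vault.internal:8200", os.Getenv("MASTER_TOKEN"))
	if err != nil {
		log.Fatal(err)
	}
	slave, err := newClient("https://vault.dc-name.internal:8200", os.Getenv("SLAVE_TOKEN"))
	if err != nil {
		log.Fatal(err)
	}

	// List the secrets staged for this datacenter under a hypothetical prefix.
	prefix := "secret/slave-vault-dc-name/"
	listing, err := master.Logical().List(prefix)
	if err != nil || listing == nil {
		log.Fatalf("list %s: %v", prefix, err)
	}

	// Copy each secret verbatim into the destination vault. This only handles a
	// flat layout; a real script would recurse into sub-paths and also prune
	// entries that should no longer exist ("current" vs. desired state).
	for _, k := range listing.Data["keys"].([]interface{}) {
		key := k.(string)
		secret, err := master.Logical().Read(prefix + key)
		if err != nil || secret == nil {
			log.Fatalf("read %s: %v", key, err)
		}
		if _, err := slave.Logical().Write("secret/"+key, secret.Data); err != nil {
			log.Fatalf("write %s: %v", key, err)
		}
		fmt.Println("copied", key)
	}
}
```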
#674 may be of interest. The person there is using consul-replicate to sync data from a master DC to separate HA Vault instances in other DCs. There's a minor bug to work out, but other than that it seems to be working.
Handling some requests locally, like one-time-use credentials and transit (encryption), would also minimize unnecessary requests back to the master. Anything that doesn't require a verified write could be cached until it can successfully be written to the master.
At this point I doubt we will ever have a distributed, asynchronous, eventually-consistent-with-a-master approach as @HorseHay is describing. There are a lot of complexities with this approach, and one thing we value very, very highly with Vault (due to it handling security-sensitive information) is predictability. The information in the article that I linked to from the mailing list could be used to distribute the transit keys to replicas, allowing local clients to use the local machines for transit; this would work quite well since transit doesn't generate leases.
Perhaps something simple to begin with, e.g. local caching (if that is indeed simple; perhaps it's not) or forwarding (where the local node simply forwards to the master and acts as a standby). Perhaps this is already available today, and in that case the improvement may simply be to polish the documentation.
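If plain forwarding turned out to be enough, it can be approximated outside of Vault today with a trivial reverse proxy in front of the master. A minimal sketch; the master address is a placeholder and this ignores TLS termination, client certificates and failover:

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// Hypothetical address of the active Vault node in the master datacenter.
	master, err := url.Parse("https://master-vault.internal:8200")
	if err != nil {
		log.Fatal(err)
	}

	// Every request hitting this local listener is forwarded verbatim to the
	// master; clients in this DC point VAULT_ADDR at the proxy instead.
	proxy := httputil.NewSingleHostReverseProxy(master)
	log.Fatal(http.ListenAndServe(":8200", proxy))
}
```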
Hi all,

We have a big Consul multi-datacenter deployment and we are going to deploy separate, independent Vault clusters alongside each Consul quorum in each datacenter. Each Vault cluster will have an independent master key (and thus unsealing tokens) and will store datacenter-specific secrets while staying independent from all other clusters. That's why the consul-replicate approach discussed in #674 would not work for us (the data is encrypted with its own keys in each DC).

What we are looking into is having an option of setting up application-level partial replication (of a sub-tree). Essentially, we would like application-level master-master replication abstracted away from any particular backend. We are going to organize our own private internal Vaults to be a sink of all the others and to provide a kind of backup and duplication (so the system will be more reliable in the face of outages or splits -- we do cross-DC operations). Maybe the Vaults across DCs will be paired or whatever, but we want to have a choice in how we organize HA here with redundancy.

Having read what the internet/GitHub have on the topic, I came up with the following idea and would be glad to have your feedback on it. The latest Vault introduced a request forwarding feature for HA mode. I am proposing using that feature (or at least a derivative of it) to replicate write requests, based on configurable filters, to any other configured datacenters. It could do write-through or write-back, but it will allow: a) flexible propagation of the data across the replication streams.

The main problem this scheme involves is consistency, but for our usage scenario it's not a problem. Usually we do not do parallel writes to multiple datacenters (so no conflicts on unordered write events) and we expect all datacenters to be available during the rare manual writes. If something more secure is required, the write ops could be implemented synchronously, returning OK only after all sinks are written. This could also guarantee consistency if we add write rollbacks on conflicts for whoever wants it.
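To make the proposed filtered fan-out a little more concrete, here is a rough client-side sketch of the idea. It is not Vault's actual request forwarding, and the sink addresses, path-prefix filters and synchronous write-through behaviour are all assumptions:

```go
package main

import (
	"log"
	"os"
	"strings"

	vault "github.com/hashicorp/vault/api"
)

// sink is one remote datacenter's Vault plus the path prefixes replicated to it.
type sink struct {
	client   *vault.Client
	prefixes []string
}

// replicateWrite writes to the local cluster first, then synchronously pushes
// the same data to every sink whose filter matches the path. Returning the
// first error lets the caller decide whether to retry or roll back.
func replicateWrite(local *vault.Client, sinks []sink, path string, data map[string]interface{}) error {
	if _, err := local.Logical().Write(path, data); err != nil {
		return err
	}
	for _, s := range sinks {
		for _, p := range s.prefixes {
			if strings.HasPrefix(path, p) {
				if _, err := s.client.Logical().Write(path, data); err != nil {
					return err
				}
				break
			}
		}
	}
	return nil
}

func mustClient(addr string) *vault.Client {
	cfg := vault.DefaultConfig()
	cfg.Address = addr
	c, err := vault.NewClient(cfg)
	if err != nil {
		log.Fatal(err)
	}
	c.SetToken(os.Getenv("VAULT_TOKEN"))
	return c
}

func main() {
	// Placeholder clusters and a single replicated sub-tree.
	local := mustClient("https://vault.dc1.internal:8200")
	sinks := []sink{
		{client: mustClient("https://vault.dc2.internal:8200"), prefixes: []string{"secret/shared/"}},
	}
	if err := replicateWrite(local, sinks, "secret/shared/db-password",
		map[string]interface{}{"value": "example"}); err != nil {
		log.Fatal(err)
	}
}
```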
Replication is a part of Vault Enterprise, which solves this use case. Closing!
@jefferai I'm not sure "you can buy our proprietary version" is an appropriate reason to close a feature request on an open source project? I can understand why Hashicorp might decline to work on it or even merge any of the resulting community work, but if others want to collaborate on it, why interrupt that?
Currently, Vault isn't really practical to use in a multi-datacenter environment. You can set up a Vault server and its backup server in a single datacenter, but there are at least two significant issues with this:
Another option is to have a set of independent Vault instances in multiple datacenters, but it then becomes the administrator's job to ensure that the data is consistent among them all -- and that won't be practicable until the often-requested enumeration feature is implemented.
Consul can be used to replicate key-value data from a master datacenter to other datacenters (via consul-replicate). Consul is also one of the supported storage backends for Vault. So it seems logical that -- at least for generic secret storage backed by Consul -- one should be able to direct write requests to a Vault instance located in and connected to the "master" datacenter, and direct read requests to a Vault instance located in and connected to the closest replica datacenter.
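A small illustration of that split on the client side, using the Vault Go API client; the addresses are placeholders and it assumes consul-replicate is already copying the master's Consul KV data to the local datacenter (and, as noted above, that the replica Vault can actually decrypt that data, i.e. it shares the master's keys):

```go
package main

import (
	"fmt"
	"log"
	"os"

	vault "github.com/hashicorp/vault/api"
)

// splitClient sends writes to the Vault in the master datacenter and serves
// reads from the Vault backed by the nearest Consul replica.
type splitClient struct {
	master *vault.Client
	local  *vault.Client
}

func (s *splitClient) Read(path string) (*vault.Secret, error) {
	return s.local.Logical().Read(path)
}

func (s *splitClient) Write(path string, data map[string]interface{}) (*vault.Secret, error) {
	return s.master.Logical().Write(path, data)
}

func newClient(addr string) *vault.Client {
	cfg := vault.DefaultConfig()
	cfg.Address = addr
	c, err := vault.NewClient(cfg)
	if err != nil {
		log.Fatal(err)
	}
	c.SetToken(os.Getenv("VAULT_TOKEN"))
	return c
}

func main() {
	// Placeholder addresses for the master-DC Vault and the closest replica.
	sc := &splitClient{
		master: newClient("https://vault.master-dc.internal:8200"),
		local:  newClient("https://vault.local-dc.internal:8200"),
	}
	if _, err := sc.Write("secret/app/config", map[string]interface{}{"key": "value"}); err != nil {
		log.Fatal(err)
	}
	secret, err := sc.Read("secret/app/config")
	if err != nil {
		log.Fatal(err)
	}
	if secret != nil {
		fmt.Println(secret.Data)
	}
}
```

Reads served from the replica would only be as fresh as the last consul-replicate sync, so anything that needs read-after-write consistency would still have to go to the master.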
The Consul and PKI backends could probably benefit from multi-datacenter support as well.