[Feature Request] data.consul_service using health endpoint #87
Comments
Hi @eedwards-sk, can you give more information about your use case? It is true that a new data source may be needed to fetch health information about a given service, but maybe you can use the Consul DNS interface to achieve the same goal. |
Hi @remilapeyre, thanks for the response. Unfortunately I cannot use DNS, and I tried many, many ways to make that work first (I know more about iptables and resolv.conf now than I ever cared to). I'm doing orchestration, running terraform as concourse-ci tasks. Inside containers you cannot override resolvers on alpine, for example (thanks to how it queries all DNS servers regardless of order), and most docker hosts (or, in my case, runc with concourse) also usually control resolv.conf, so it's not always possible to redirect DNS to a consul agent. I tried to state the use case above, but I'll reiterate and try to expand on it further:
from
# ==========
# data sources
# ==========
data "consul_service" "vault" {
  name = "vault"
}
# ==========
# providers
# ==========
provider "consul" {
  version = "~> 2.2"
}
provider "vault" {
  version = "~> 1.4"
  address = "https://${data.consul_service.vault.service.0.address}:${data.consul_service.vault.service.0.port}"
}
As you can see, I'm leveraging the data source to get the address of the service. Because it's the catalog interface, it returns all nodes, regardless of health. This works brilliantly when there are no dead nodes, but breaks the moment there's a dead node, which is quite common if you have client nodes in an autoscale group or similar. So ideally I need a data source that supports the health endpoint. Not sure if it would be a generic macro scale Does that make sense?
I'm doing a lot of automation and orchestration involving the setup and administration of a consul, vault, and concourse stack. As mentioned above, you cannot easily mess with container DNS, so this is the ideal method of retrieving service node addresses. If I can use the data source to get healthy addresses, then I can effectively use it for service discovery. As it stands, it doesn't serve much use, because the result set includes unhealthy nodes.
I need it badly enough that I'm willing to learn Go and start a PR if I have to, but if anyone else wants to jump on it, I'd wish them all the luck and kindness in the world! |
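To make the catalog-versus-health distinction concrete, here is a minimal Python sketch (not the provider's implementation) that takes the decoded JSON from Consul's /v1/health/service/<name> endpoint and keeps only the instances whose checks are all passing. The function name healthy_addresses and the sample data are made up for illustration; the Node, Service, and Checks keys follow the documented response shape of that endpoint.

```python
def healthy_addresses(entries):
    """Return (address, port) pairs for instances whose checks all pass.

    `entries` is the decoded JSON list from /v1/health/service/<name>.
    """
    result = []
    for entry in entries:
        checks = entry.get("Checks", [])
        if any(check.get("Status") != "passing" for check in checks):
            continue  # skip instances with a warning or critical check
        service = entry["Service"]
        # Service.Address can be empty; fall back to the node address then.
        address = service.get("Address") or entry["Node"]["Address"]
        result.append((address, service["Port"]))
    return result


# Illustrative sample (hypothetical nodes): one healthy, one dead.
sample = [
    {
        "Node": {"Node": "vault-1", "Address": "10.0.0.11"},
        "Service": {"Service": "vault", "Address": "", "Port": 8200},
        "Checks": [{"CheckID": "serfHealth", "Status": "passing"}],
    },
    {
        "Node": {"Node": "vault-2", "Address": "10.0.0.12"},
        "Service": {"Service": "vault", "Address": "10.0.0.12", "Port": 8200},
        "Checks": [{"CheckID": "serfHealth", "Status": "critical"}],
    },
]

print(healthy_addresses(sample))
```

A catalog lookup would return both sample nodes; the filter above drops the one with a critical check, which is the behavior requested in this issue.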
Hi @eedwards-sk, thanks for taking the time to explain why using the DNS interface was not an option. A new I can start working on this next week. |
Awesome, please let me know if you need any testing. I'm happy to help! |
Hi @eedwards-sk, I started working on the PR. It is not fully ready yet, but you can see the progress in #89. You should be able to get healthy vault instances with:
Can you try it and give your feedback? |
@remilapeyre Sure, what's the correct way to load a provider overriding the default (so I can use your PR's version)? |
You can put the provider in https://www.terraform.io/docs/extend/how-terraform-works.html#discovery |
If you are not able to build it, I think I should be able to send you a cross-compiled binary |
Thanks, I'm running terraform through my tool concourse-terraform, so would it work if I pull in your PR and stage it into the terraform working directory? I'd need to confirm the expected layout of the plugin dir once checked out. It's not easy to stage them in
That would work :) I'm running this inside alpine |
You can download it here: https://temp-terraform-consul.s3.eu-central-1.amazonaws.com/terraform-provider-consul The sha sum should be |
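Since the expected checksum is not preserved above, the following is only a generic sketch of how a downloaded plugin binary could be checked before use. It assumes the "sha sum" is a SHA-256 hex digest, which the thread does not state; the helper names sha256_of and matches are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large binaries are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with the published one (assumed SHA-256 hex)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling matches("terraform-provider-consul", "<published digest>") would return True only when the file on disk is byte-for-byte the one that was published.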
I don't think you can build it in your image but if you copy it to |
okay, I was successfully able to try it after:
I got a failure so I enabled debug output and captured it:
|
Thanks, I will do further testing tonight. |
Hi @eedwards-sk, I made a more comprehensive test but wasn't able to reproduce the bug. Could you send me the result of http://consul_hostname:8500/v1/health/service/vault if there is no confidential information in it? |
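For readers who want to capture the same output, here is a small standard-library sketch for querying that endpoint. The function names are hypothetical, and fetch_health needs a reachable Consul agent; consul_hostname in the URL above is a placeholder for your own host. The optional passing query parameter asks Consul to return only instances whose checks are all passing.

```python
import json
import urllib.parse
import urllib.request


def health_url(base_url, service, passing_only=False):
    """Build the URL of Consul's health endpoint for one service."""
    url = f"{base_url.rstrip('/')}/v1/health/service/{urllib.parse.quote(service)}"
    return url + "?passing" if passing_only else url


def fetch_health(base_url, service, passing_only=False, timeout=5.0):
    """Fetch and decode the health entries (requires network access to Consul)."""
    url = health_url(base_url, service, passing_only)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


print(health_url("http://localhost:8500", "vault"))
```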
@remilapeyre absolutely! This is a testing cluster that I'll be tearing down anyways. vault service health
|
I'm a bit puzzled because in
each field should contain a duration like When testing, I run a Consul development server with
and when fetching http://localhost:8500/v1/health/service/vault I get:
where Can you tell me your versions of Vault and Consul? |
vault 1.0.1
vault automatically registers the health checks with consul: https://github.com/hashicorp/vault/blob/v1.0.1/physical/consul/consul.go
Edit: specifically, https://github.com/hashicorp/vault/blob/v1.0.1/physical/consul/consul.go#L827-L844
Edit2: Doesn't seem to be anything to do with vault. Consul's own serfHealth check also has a definition block like that (with keys with empty values). Maybe something to do with |
Ok, I'm using Consul 1.4.0 and the difference in behavior comes from that. There have been many changes to this endpoint between 1.4.0 and 1.4.1, I will try to update |
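The exchange above suggests the failure came from empty values in a check's definition block where duration strings were expected. As an illustration only (this is not the fix that went into the plugin), a client can treat an empty string as "not set" instead of failing. The parser below handles Go-style durations such as "10s" or "1m30s"; the Interval and Timeout keys in the sample are assumptions chosen for the example, since the actual captured output is not preserved in this thread.

```python
import re

_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|s|m|h)")


def parse_duration(text):
    """Parse a Go-style duration into seconds; an empty string yields None."""
    if text == "":
        return None
    pos, total = 0, 0.0
    for match in _TOKEN.finditer(text):
        if match.start() != pos:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return total


# Hypothetical definition block with empty values, as described in the thread.
definition = {"Interval": "", "Timeout": ""}
parsed = {key: parse_duration(value) for key, value in definition.items()}
print(parsed)
```

Malformed values such as "10" (no unit) still raise ValueError, so only the empty case is relaxed.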
Ok, I updated the plugin at https://temp-terraform-consul.s3.eu-central-1.amazonaws.com/terraform-provider-consul The new shasum is |
plan with that version was successful |
Awesome 🎉. I need to have a better test grid with multiple versions of Consul so this does not happen again in the future. Thanks for your help! |
Thanks for the work! |
heya @remilapeyre any progress on this? |
Hi @eedwards-sk, I'm having some issues implementing the retry suggested in #89 (comment). I think I will make a new release in the next few days and will merge the feature as it is now, without the retry option, if I can't get it to work. |
sounds good, the retry is outside the scope of my use case anyway, but would make a good follow-up enhancement |
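The thread does not spell out how the retry proposed in #89 was meant to work, so the following is only a generic sketch of the idea: keep polling until at least one healthy instance shows up or a deadline passes. The function name wait_for_healthy and all of its parameters are hypothetical; fetch stands for any callable that returns the current list of healthy instances.

```python
import time


def wait_for_healthy(fetch, timeout=30.0, interval=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until it returns a non-empty list or the timeout expires."""
    deadline = clock() + timeout
    while True:
        instances = fetch()
        if instances:
            return instances
        if clock() >= deadline:
            raise TimeoutError("no healthy instance before the deadline")
        sleep(interval)
```

Passing sleep and clock as parameters keeps the loop easy to exercise in tests without real waiting.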
Currently, data.consul_service has almost no use for me, as I discovered it returns from the catalog and does not exercise the health check results.
Ideally, I'd like to use data.consul_service to return the healthy addresses for a service. I'm not sure what use cases people have where they want to get unhealthy results, but I don't have one.
I need to configure terraform's vault provider with the address of a healthy vault instance. With the current data.consul_service behavior, I end up getting back node addresses of unhealthy or left nodes.