E2E: Consul WI test needs refactoring #19698

Closed
pkazmierczak opened this issue Jan 10, 2024 · 4 comments

Comments

@pkazmierczak
Contributor

pkazmierczak commented Jan 10, 2024

I recently removed a broken Consul WI integration test from the E2E suite because fixing it requires a larger refactoring. The test fails because we use HCP Consul, which cannot connect to Nomad's JWKS endpoints over TLS. One solution would be to run Nomad and Consul locally for that test, much like we do in the consulcompat tests. Ideally, we should extract the common configuration and setup methods from the consulcompat tests into e2eutil and rewrite the removed test.
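
To illustrate the failure mode, here is a minimal sketch of the kind of check Consul effectively has to pass: fetch the JWKS document over TLS and read the keys out of it. The URL, port, and `ca_file` argument are assumptions for illustration, not values taken from the E2E cluster, and the helper names are hypothetical.

```python
import json
import ssl
import urllib.request
from typing import List, Optional


def parse_jwks(body: bytes) -> List[str]:
    """Return the key IDs found in a JWKS document, or raise ValueError."""
    doc = json.loads(body)
    keys = doc.get("keys") if isinstance(doc, dict) else None
    if not isinstance(keys, list) or not keys:
        raise ValueError("JWKS document has no keys")
    return [str(k.get("kid", "")) for k in keys]


def fetch_jwks(url: str, ca_file: Optional[str] = None, timeout: float = 5.0) -> List[str]:
    """Fetch a JWKS document over TLS and return its key IDs.

    ca_file is a hypothetical path to the CA bundle that signed the Nomad
    server certificate; None falls back to the system trust store.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
        return parse_jwks(resp.read())


if __name__ == "__main__":
    # Assumed address: substitute whatever address Consul would use for Nomad.
    print(fetch_jwks("https://nomad.example.com:4646/.well-known/jwks.json"))
```

If this request cannot complete from wherever Consul runs, whether because the address is unroutable or the certificate is untrusted, then Consul has no keys with which to verify a Workload Identity token.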

@gulducat
Member

Might be able to expose /.well-known/* to the public internet for Consul to reach (maybe a Task API proxy job?), but that would also require some cloud plumbing to TLSify it and route traffic appropriately...

tgross added a commit that referenced this issue Apr 2, 2024
Our `consulcompat` tests exercise both the Workload Identity and legacy Consul
token workflow, but they are limited to running single node tests. The E2E
cluster is network isolated, so using our HCP Consul cluster runs into a
problem validating WI tokens because it can't reach the JWKS endpoint. In real
production environments, you'd solve this with a CNAME pointing to a public IP
pointing to a proxy with a real domain name. But that's logistically
impractical for our ephemeral nightly cluster.

Migrate the HCP Consul to a single-node Consul cluster on AWS EC2 alongside our
Nomad cluster. Bootstrap TLS and ACLs in Terraform and ensure all nodes can
reach each other. This will allow us to update our Consul tests so they can use
Workload Identity, in a separate PR.

Ref: #19698
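
As a rough sketch of why reachability matters, the snippet below builds the kind of JWT auth method configuration Consul uses to validate Workload Identity tokens. The field names follow Consul's JWT auth method, but the URL, audience, and claim mappings are illustrative assumptions rather than the values used in the E2E Terraform, and `jwt_auth_method_config` is a hypothetical helper.

```python
import json


def jwt_auth_method_config(jwks_url: str, ca_pem: str = "") -> dict:
    """Build a Consul JWT auth method config pointing at a JWKS URL.

    All values here are placeholders; the real E2E configuration may differ.
    """
    config = {
        # Consul fetches signing keys from this URL, so it must be
        # reachable from the Consul server.
        "JWKSURL": jwks_url,
        "JWTSupportedAlgs": ["RS256"],
        "BoundAudiences": ["consul.io"],
        "ClaimMappings": {
            "nomad_namespace": "nomad_namespace",
            "nomad_job_id": "nomad_job_id",
        },
    }
    if ca_pem:
        # Only needed when the JWKS endpoint uses a private CA.
        config["JWKSCACert"] = ca_pem
    return config


if __name__ == "__main__":
    # Assumed private address inside the same network as the Consul server.
    cfg = jwt_auth_method_config("https://nomad.example.internal:4646/.well-known/jwks.json")
    print(json.dumps(cfg, indent=2))
```

With Consul running alongside the Nomad cluster, the `JWKSURL` can be a private address that every node can reach, which is the property the HCP setup lacked.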
@tgross tgross self-assigned this Apr 17, 2024
philrenaud pushed a commit that referenced this issue Apr 18, 2024
Our `consulcompat` tests exercise both the Workload Identity and legacy Consul
token workflow, but they are limited to running single node tests. The E2E
cluster is network isolated, so using our HCP Consul cluster runs into a
problem validating WI tokens because it can't reach the JWKS endpoint. In real
production environments, you'd solve this with a CNAME pointing to a public IP
pointing to a proxy with a real domain name. But that's logistically
impractical for our ephemeral nightly cluster.

Migrate the HCP Consul to a single-node Consul cluster on AWS EC2 alongside our
Nomad cluster. Bootstrap TLS and ACLs in Terraform and ensure all nodes can
reach each other. This will allow us to update our Consul tests so they can use
Workload Identity, in a separate PR.

Ref: #19698
@tgross
Member

tgross commented Apr 26, 2024

See also #19250

@tgross
Member

tgross commented May 13, 2024

I'm going to close this issue as this was effectively covered by work in #20278. I've still got #19250 open to clean up the remaining Consul E2E tests.

@tgross tgross closed this as completed May 13, 2024

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 28, 2024