Single Cluster Regional LB (gke-l7-rilb) https example #62

Open
boredabdel opened this issue Oct 22, 2021 · 3 comments
Assignees
Labels
help wanted Extra attention is needed Internal LB

Comments

@boredabdel
Member

Single-cluster regional internal LB with the gke-l7-rilb gatewayClass and HTTPS from client to LB

@boredabdel boredabdel added the help wanted Extra attention is needed label Feb 22, 2022
@danielmarzini danielmarzini self-assigned this Feb 22, 2022
@soudaburger

It's now been a year. This is the exact example I need to implement, but documentation is severely lacking on how to do this with TLS. All examples seem to show the global load balancer, but since GCP doesn't support managed certificates for regional LBs, I'm not entirely sure how this is expected to be implemented.

@neoakris

Hi all, this was harder than it should have been. I figured it out while looking into something loosely related. I also found a lack of examples, so I'll post my method here. I worked it out by trial and error, but I can confirm it works.

Tricky Parts that weren't well documented:

  • https://cloud.google.com/load-balancing/docs/l7-internal
    says: "Regional internal Application Load Balancer. This is a regional load balancer that is implemented as a managed service based on the open-source Envoy proxy."
  • https://cloud.google.com/load-balancing/docs/proxy-only-subnets
    says: "This page describes how to work with proxy-only subnets used by Envoy-based load balancers. A proxy-only subnet provides a pool of IP addresses that are reserved exclusively for Envoy proxies used by Google Cloud load balancers. It cannot be used for any other purposes."
  • You have to create a special proxy-only subnet in the region you're using for this internal regional L7 LB to provision correctly. The docs above make it sound like you associate that subnet with the L7 LB via gateway.yaml, but doing so results in errors. The proxy-only subnet just has to exist, and what you actually reference is a normal subnet, which is super unintuitive given how the docs are worded. (I'll elaborate on this later in the example.)
  • HTTPS certs created with gcloud certificate-manager only work with global LBs
  • "classic" HTTPS certs need to be created with gcloud compute ssl-certificates ... to work with regional internal LB

Setting some Bash Shell Env Vars

export CLUSTER_NAME=cluster-1
export REGION=us-central1
export ZONE=us-central1-c
export PROJECT=chrism-playground-369416
export DOMAIN=neoakris.dev
gcloud config set project $PROJECT
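
Every later command depends on these exports, so a small guard can catch a missing one before anything is provisioned. This is only a sketch: `require_vars` is a hypothetical helper name, not something from gcloud or from the steps above.

```shell
# require_vars NAME...: succeed only if every named variable is set and non-empty.
# (Hypothetical helper; the variable names passed in are the ones exported above.)
require_vars() {
  local name value missing=0
  for name in "$@"; do
    # Indirect lookup: read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "required variable is not set: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example use in the same shell session as the exports:
# require_vars CLUSTER_NAME REGION ZONE PROJECT DOMAIN || echo "fix the exports first"
```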

HTTPS Cert Prep Work:

  • This is for a self-managed cert. I think I saw docs saying Google-managed certs only work with global LBs, but I might be wrong about that; it might work as long as the cert is created with gcloud compute ssl-certificates ...
# [admin@workstation:~]
mkdir -p ~/cert
cd ~/cert
docker run -it --entrypoint=/bin/sh --volume $HOME/cert:/.lego/certificates   docker.io/goacme/lego:latest

# [shell@dockerized-ACME-client:/]
# (Note: /lego intentionally uses the full path to the binary; plain "lego" fails with "not found" because it isn't on the PATH)
/lego --email "[email protected]" --domains="*.neoakris.dev" --dns "manual" run

# Press Y to accept the TOS
# Follow the CLI's directions to manually add an _acme-challenge.neoakris.dev TXT record to DNS
# 2023/08/17 03:43:43 [INFO] [*.neoakris.dev] acme: Preparing to solve DNS-01
# lego: Please create the following TXT record in your neoakris.dev. zone:
# _acme-challenge.neoakris.dev. 120 IN TXT "y1HOVUQthxMIBcQLYTn18j5pbwWGxOK770g4_wvV7Tw"
# lego: Press 'Enter' when you are done
# 
# ^-- Manually logged into DNS admin portal to create a TXT record, then pressed enter
#
# 2023/08/17 03:45:07 [INFO] [*.neoakris.dev] acme: Validations succeeded; requesting certificates
# 2023/08/17 03:45:08 [INFO] [*.neoakris.dev] Server responded with a certificate.

# [shell@dockerized-ACME-client:/]
exit

# [admin@workstation:~/cert]
ls
# _.neoakris.dev.crt		_.neoakris.dev.issuer.crt	_.neoakris.dev.json		_.neoakris.dev.key
# ^-- These files were created by the lego CLI.
# (lego generated the private key, and Let's Encrypt, a public internet CA, issued the cert.
#  The issuing CA is trusted by operating systems out of the box, so no additional configuration is required.)

gcloud compute ssl-certificates create my-imported-cert --certificate=_.neoakris.dev.crt --private-key=_.neoakris.dev.key --region=$REGION
# ^-- this lets it be attached to GCP-managed internal regional L7 LBs to terminate HTTPS at the LB
# v-- some verification commands (the cert is regional, so describe needs --region)
gcloud compute ssl-certificates list
gcloud compute ssl-certificates describe my-imported-cert --region=$REGION
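
One thing worth checking before picking hostnames for routes: a wildcard cert such as *.neoakris.dev covers exactly one label to the left of the domain, so it matches gateway-api-example.neoakris.dev but not a.b.neoakris.dev or the bare neoakris.dev. The sketch below illustrates that rule with a hypothetical helper, `wildcard_covers`; it does a plain case-sensitive string comparison and is not a substitute for real TLS validation.

```shell
# wildcard_covers PATTERN HOST: succeed if a cert name like "*.example.com"
# (or an exact name) covers HOST. Hypothetical helper for illustration only.
wildcard_covers() {
  local pattern="$1" host="$2" suffix label
  case "$pattern" in
    '*.'*) suffix="${pattern#'*'}" ;;           # "*.neoakris.dev" -> ".neoakris.dev"
    *)     [ "$pattern" = "$host" ]; return ;;  # no wildcard: exact match only
  esac
  case "$host" in
    *"$suffix") label="${host%"$suffix"}" ;;    # part of HOST left of the suffix
    *)          return 1 ;;
  esac
  case "$label" in
    ''|*.*) return 1 ;;                         # empty, or more than one label
  esac
  return 0
}

if wildcard_covers '*.neoakris.dev' 'gateway-api-example.neoakris.dev'; then
  echo "hostname is covered by the wildcard cert"
fi
```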

Subnet and Reserved Static Private IP Prep Work for gke-l7-rilb class of GKE Gateway API Controller:

  • Note: the default VPC's auto-created subnets reserve 10.128.0.0/9 (10.128.0.0 - 10.255.255.255),
    which means 10.(0-127).x.y is fair game, so in the commands below I arbitrarily chose
    10.127.127.0/24 as the proxy-only subnet dedicated to us-central1's internal regional LBs.
  • The proxy-only subnet just has to exist in the region your GKE cluster is in.
export LB_SUBNET_NAME=regional-managed-proxy-only
export GKE_HTTPS_GATEWAY_LB_IP_NAME=https-gateway-lb-private-ip

gcloud compute networks subnets create $LB_SUBNET_NAME --purpose=REGIONAL_MANAGED_PROXY --role=ACTIVE --region=$REGION --network=default --range=10.127.127.0/24

gcloud compute addresses create $GKE_HTTPS_GATEWAY_LB_IP_NAME --purpose=SHARED_LOADBALANCER_VIP --region=$REGION --subnet=default
# ^-- Pre-creates a reserved internal static IP for gke-l7-rilb.
#      Not required, but it keeps the IP consistent across IaC-based teardowns.
#      The super unintuitive part: the docs make it sound like this should
#      reference --subnet=$LB_SUBNET_NAME, but that leads to a cryptic, unhelpful
#      error when provisioning gateway.yaml.
#      Also note that the --purpose flags on both commands matter for this to work.

export GKE_HTTPS_GATEWAY_LB_IP_VALUE=$(gcloud compute addresses describe $GKE_HTTPS_GATEWAY_LB_IP_NAME --region=$REGION --format='value(address)')
# ^-- looks up the address and stores it in a shell env var (--region avoids an interactive region prompt)
echo $GKE_HTTPS_GATEWAY_LB_IP_VALUE
# 10.128.0.37 
# ^-- This is a reserved static private IP that will be used by the LB, so you can configure DNS in advance if you like

Step 1: provision a GKE standard zonal sandbox cluster (1 node was enough for testing)

  • few manual clicks in the GUI

Step 2: Enable GKE Gateway API Controller

  • One method is to install the CRDs yourself, which in theory is just a kubectl apply -f, but it's better to enable the feature as shown here so the CRDs are managed by GKE.
kubectl get crd | grep gateway
# ^-- verification command shows nothing = not enabled
gcloud container clusters update $CLUSTER_NAME --zone=$ZONE --gateway-api=standard
kubectl get crd | grep gateway
# ^-- verification command shows 5 gateway api specific CRDs, so it's enabled
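
If you want this check to be scriptable rather than eyeballed, one option is to test for the specific CRDs the later steps use. A minimal sketch, assuming the `kubectl get crd -o name` output format (one `customresourcedefinition.apiextensions.k8s.io/<name>` per line); `has_gateway_crds` is a hypothetical helper name.

```shell
# has_gateway_crds: reads `kubectl get crd -o name` output on stdin and succeeds
# only if the GatewayClass, Gateway, and HTTPRoute CRDs are all listed.
has_gateway_crds() {
  local input crd
  input=$(cat)
  for crd in gatewayclasses gateways httproutes; do
    printf '%s\n' "$input" | grep -qF "/$crd.gateway.networking.k8s.io" || return 1
  done
}

# Example use against a live cluster:
# kubectl get crd -o name | has_gateway_crds && echo "Gateway API CRDs present"
```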

Step 3: Install an example app and tester pod, so we'll be able to test / verify working as expected

helm upgrade --install podinfo oci://ghcr.io/stefanprodan/charts/podinfo --namespace default
kubectl get svc 
# ^-- service is named podinfo   listens on 9898

kubectl run -it curl --image=docker.io/curlimages/curl -- sh
# [shell@pod-with-curl-that-can-talk-to-private-ip-lb: ~]
exit 
# [admin@workstation:~]

Step 4: Deploy Gateway API Resources to provision Internal L7 Regional LB

tee gateway.yaml  << EOF
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: internal-https-gateway
  namespace: default
spec:
  gatewayClassName: gke-l7-rilb # Regional Internal LB
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      kinds:
      - kind: HTTPRoute
      namespaces:
        from: All
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      options:
        networking.gke.io/pre-shared-certs: my-imported-cert # <-- made with 'gcloud compute ssl-certificates ...'
    allowedRoutes:
      kinds:
      - kind: HTTPRoute
      namespaces:
        from: All
  addresses:
    - type: NamedAddress   # Allows use of a pre-provisioned, predictable IP instead of a dynamically provisioned one.
      value: "https-gateway-lb-private-ip"  # <-- created earlier `gcloud compute addresses list`
EOF

tee httproute.yaml  << EOF
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: example
  namespace: default
spec:
  parentRefs:
  - name: internal-https-gateway #<-- reference to gateway's name, must match
  hostnames:
  - "gateway-api-example.neoakris.dev"
  rules:
  - backendRefs:
    - name: podinfo
      port: 9898
EOF

kubectl apply -f gateway.yaml
kubectl apply -f httproute.yaml

# kubectl describe (both objects), to verify both have finished (takes 2-5 min)
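
Since provisioning takes a few minutes, a small polling loop can replace repeated manual describes. This is a sketch under assumptions: `poll_until` and `gateway_has_address` are hypothetical helpers, and treating "the Gateway status reports an address" as the readiness signal is my assumption rather than something confirmed above.

```shell
# poll_until TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts. Returns 1 if it never succeeds.
poll_until() {
  local tries="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$tries" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$tries" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example use against a live cluster (assumes the Gateway name from gateway.yaml):
# gateway_has_address() {
#   [ -n "$(kubectl get gateway internal-https-gateway -n default \
#           -o jsonpath='{.status.addresses[0].value}' 2>/dev/null)" ]
# }
# poll_until 30 10 gateway_has_address && echo "gateway has an address"
```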

Step 5: Test / Verify

  • This test uses the curl pod created earlier, so it works without having to configure DNS or remote access.
# [admin@workstation:~]
kubectl exec -it curl -- sh
# [shell@pod-with-curl-that-can-talk-to-private-ip-lb: ~]
curl --location https://gateway-api-example.neoakris.dev/ --resolve gateway-api-example.neoakris.dev:443:10.128.0.37
# {
#   "hostname": "podinfo-58d6d8bfd8-qwgkt",
#   "version": "6.4.1",
#   "revision": "4892983fd12e3ffffcd5a189b1549f2ef26b81c2",
#   "color": "#008000",
#   "logo": "https://raw.githubusercontent.com/stefanprodan/podinfo/gh-pages/cuddle_clap.gif",
#   "message": "greetings from green app",
#   "goos": "linux",
#   "goarch": "amd64",
#   "runtime": "go1.21.0",
#   "num_goroutine": "8",
#   "num_cpu": "2"
# }

@neoakris

neoakris commented Aug 18, 2023

Oh right, by the way, this isn't how I'd do it if I were doing it for real. (I did it this way because I was looking into something for a customer of DoiT International, https://doit.com, GCP Partners with great support at no cost to customers.)

Unfortunately, for whatever reason, GCP doesn't support managed certs (as in auto-provisioned, auto-rotated) for private IP / internal LBs. (source: https://cloud.google.com/kubernetes-engine/docs/concepts/gateway-security#tls-support)
(I say unfortunately because I see no technical reason they couldn't support it; it just seems to be an unwillingness to prioritize development of the functionality.)

I'd combine the ability of the GCP LB controller (Ingress / Gateway API) to upload an HTTPS cert stored in a kube TLS secret to the GCP-managed LB with cert-manager.io (a cloud-agnostic way of getting a bot-managed cert that is auto-provisioned and auto-rotated; the only difference is that the cert lives in a kube secret).

https://cert-manager.io is a "kubernetes operator" (a software bot / app running as a pod in the kube cluster).
It can auto-provision and auto-rotate a wildcard cert for *.neoakris.dev using the ACME DNS challenge
against "Let's Encrypt", a free public internet Certificate Authority. (As in, the HTTPS cert it issues is signed by a CA whose trust is baked into modern operating systems, so there's no need to add a CA as trusted.)

The cert-manager.io kube operator would create a kube secret of type tls containing the wildcard cert and keep it provisioned / auto-rotated; basically, the software bot would manage the HTTPS cert for you.

Then I'd configure the Gateway API kube custom resources to reference the kube tls secret instead of a GCP pre-shared-cert.

That way it'd be maintenance free / no need to manually rotate certs, and wouldn't have to mess with self-managed CA / PKI.
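
To make that concrete, here is a rough, untested sketch of what the two resources might look like, written in the same tee/heredoc style as above. Assumptions are labeled in the comments: the ClusterIssuer name `letsencrypt-dns01` and the secret name `wildcard-neoakris-dev-tls` are hypothetical, and a working cert-manager install with a DNS-01 solver is assumed to already exist.

```shell
# cert-manager Certificate: asks the (hypothetical) issuer for a wildcard cert
# and stores it in a kube TLS secret that cert-manager keeps renewed.
cat > certificate.yaml <<'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-neoakris-dev
  namespace: default
spec:
  secretName: wildcard-neoakris-dev-tls   # hypothetical secret name
  dnsNames:
  - "*.neoakris.dev"
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-dns01               # hypothetical issuer, must be set up separately
EOF

# Gateway: same as before, except the HTTPS listener references the kube secret
# via certificateRefs instead of the pre-shared-certs option.
cat > gateway-with-secret.yaml <<'EOF'
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: internal-https-gateway
  namespace: default
spec:
  gatewayClassName: gke-l7-rilb
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        name: wildcard-neoakris-dev-tls   # must match secretName in certificate.yaml
    allowedRoutes:
      kinds:
      - kind: HTTPRoute
      namespaces:
        from: All
  addresses:
  - type: NamedAddress
    value: "https-gateway-lb-private-ip"
EOF
```

Both files would then be applied with kubectl apply -f, the same way as gateway.yaml above. The secret has to live in the same namespace as the Gateway unless cross-namespace references are explicitly granted.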

4 participants