issues with self hosted kubernetes deployment (zrok, ziti-controller, ziti-router) #272
Pull request to mitigate leading hyphens in Ziti ID strings: #274
Issue to raise concern about the underlying problem: openziti/ziti#2534 |
That would be most welcome. There's a pattern in the ziti-controller and ziti-router charts for mounting additional volumes, but you may have a better way already established you could use for extraEnv vars from secret mounts or existing identities, or both. Another user reported this issue too in #273
That's understandable. The scripts for zrok controller and zrok frontend show a bias for simplicity at the cost of flexibility. If we need significantly more flexibility, it would be wise to consider if there's another approach better suited than shell scripts. I'm reluctant to try to accomplish too much with shell scripts because they can become quite challenging to maintain. |
Do you have any theories how this breaks with ArgoCD? Does ArgoCD significantly depart from the typical workflow of running the
This sounds like it could stem from the same root. I haven't seen the problem you're describing myself, so I'm guessing it could be related to how ArgoCD works. Here's the part of
# granted permission to read secrets in namespace by SA managed by this chart
if kubectl -n {{ .Release.Namespace }} get secret \
{{ include "zrok.fullname" . }}-ziggy-account-token &>/dev/null; then
echo "INFO: ziggy account enable token secret exists"
else
echo "INFO: ziggy account enable token secret does not exist, creating secret"
# create a default user account named "ziggy" and save the enable token in a Secret resource
zrok admin create account \
ziggy@{{ .Values.dnsZone }} \
{{ $ziggyPassword | b64dec | quote }} \
| xargs -I TOKEN kubectl -n {{ .Release.Namespace }} create secret generic \
{{ include "zrok.fullname" . }}-ziggy-account-token \
--from-literal=token=TOKEN
# xargs -r is NOT used here because this command must fail loudly if the account token was not created
fi
And, here's the part of that script that creates the zrok "public" frontend if it does not already exist in Ziti.
# if default "public" frontend already exists
ZROK_PUBLIC_TOKEN=$(getZrokPublicFrontend token)
if [[ -n "${ZROK_PUBLIC_TOKEN}" ]]; then
# ensure the Ziti ID of the public frontend's identity is the same in Ziti and zrok
ZROK_PUBLIC_ZID=$(getZrokPublicFrontend zid)
if [[ "${ZITI_PUBLIC_ID}" != "${ZROK_PUBLIC_ZID}" ]]; then
echo "ERROR: existing Ziti Identity named 'public' with id '$ZITI_PUBLIC_ID' is from a previous zrok"\
"instance life cycle. Delete it then re-run zrok." >&2
exit 1
fi
echo "INFO: updating frontend"
zrok admin update frontend "${ZROK_PUBLIC_TOKEN}" \
--url-template "{{ .Values.frontend.ingress.scheme }}://{token}.{{ .Values.dnsZone }}"
else
echo "INFO: creating frontend"
zrok admin create frontend -- "${ZITI_PUBLIC_ID}" public \
"{{ .Values.frontend.ingress.scheme }}://{token}.{{ .Values.dnsZone }}"
fi |
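The `--` before `"${ZITI_PUBLIC_ID}"` in the `zrok admin create frontend` call above is the conventional end-of-options marker, which is what keeps an ID such as `-D3xLHGw2` from being read as a flag. Here is a minimal sketch of that convention, assuming a hypothetical `show_id` function built on POSIX `getopts`. It is only a stand-in and does not model zrok's own argument parser.

```shell
# Hypothetical stand-in for a CLI that accepts a -v flag and one positional ID.
show_id() {
  OPTIND=1                       # reset so the function can be called repeatedly
  while getopts "v" opt; do
    case "$opt" in
      v) : ;;                    # a flag the stand-in accepts and ignores
      *) return 2 ;;             # any other option is rejected
    esac
  done
  shift $((OPTIND - 1))          # drop parsed options (and the "--" marker)
  printf 'id=%s\n' "$1"
}

ZITI_ID='-D3xLHGw2'

# Without "--" the leading hyphen makes the ID look like option -D.
show_id "$ZITI_ID" 2>/dev/null || echo "rejected: ID was parsed as an option"

# With "--" everything after it is treated as a positional argument.
show_id -- "$ZITI_ID"            # prints: id=-D3xLHGw2
```

The same marker works with most CLIs that follow POSIX or GNU option conventions, so it is a cheap guard wherever a generated ID is passed as a positional argument.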
A zrok instance requires a Ziti network, and a Ziti network requires at least one router and controller. The router(s) and controller(s) are typically separate deployments, and we're starting to explore using StatefulSets to describe multi-router and multi-controller deployments.
I was thinking the same thing but never finished working on that branch. I like the idea of the Ziti controller immediately creating a first router named like "default" or "public" and storing the enrollment token in a K8S secret to simplify the router deployment that typically follows on its heels. Another option in mind is a separate umbrella chart like "ziti-stack" that orchestrates the router enrollment parcel to the controller deployment. That might work, but an Operator feels like the better tool for the job of automating life cycle, ops, etc. |
Now I see pull request to prune the obsolete value: #275 |
Sorry for the late response. After some fiddling around, I managed to start zrok with Ziti. The ziti-controller needs to start first, then the ziti-router, and router policies need to be created on the ziti-controller. After that, zrok can start and will successfully create a private/public share.
ziti edge create edge-router router-dev \
--role-attributes "public" --tunneler-enabled --jwt-output-file /tmp/router-dev.jwt
ziti edge create edge-router-policy all-endpoints-public-routers --edge-router-roles "#public" --identity-roles "#all"
ziti edge create service-edge-router-policy all-routers-all-services --edge-router-roles "#all" --service-roles "#all"
Issues from ArgoCD mostly arose when deleting resources, which also deleted the secret and the PVC that stored the sqlite database. zrok then tried to bootstrap again, but the identities already existed in Ziti, which caused an error. In theory, hooks and all the resources could be managed better with a Kubernetes operator pattern and some custom CRDs. I'm not very experienced with that, but it would probably make the most sense. An umbrella chart would probably be easier to maintain, but it gives less flexibility. ArgoCD essentially renders the Helm chart with helm template and applies those manifests with kubectl; most hooks work the same way and are mapped to Argo CD hooks on injection.
I will try to get to it this week and open PR. |
There might be a problem with setting the DB password for zrok using ArgoCD. I am not sure whether zrok supports environment variables in the ctrl.yaml file, which is generated here: https://github.com/openziti/helm-charts/blob/main/charts/zrok/templates/controller-secrets-configmap.yaml#L272 I wanted to use a lookup for the secret to replace the DB password value, but I'm afraid that won't work with ArgoCD: argoproj/argo-cd#5202 An easier way would be to set env variables there and have the application read them from the environment. If that is not an option, then as a dirty workaround an initContainer could expand the script with envsubst and mount the result on zrok. |
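The envsubst workaround mentioned above could look roughly like the following sketch of what an initContainer might run. The template contents, the `ZROK_DB_PASSWORD` variable name, and the `render_config` helper are all assumptions made for illustration; they are not taken from the chart. If `envsubst` is not installed, the sketch falls back to a literal replacement with awk.

```shell
set -eu

# Hypothetical template fragment; the real ctrl.yaml layout may differ.
template='store:
  path: "host=db user=zrok password=${ZROK_DB_PASSWORD} dbname=zrok"
  type: postgres
note: "a literal $HOME stays untouched"'

# Hypothetical helper: reads a template on stdin, writes the rendered config.
render_config() {
  if command -v envsubst >/dev/null 2>&1; then
    # Restrict substitution to one variable so other "$" text is preserved.
    envsubst '${ZROK_DB_PASSWORD}'
  else
    # Fallback: literal (non-regex) replacement of the same placeholder.
    awk '{
      tok = "${ZROK_DB_PASSWORD}"; out = ""
      while ((i = index($0, tok)) > 0) {
        out = out substr($0, 1, i - 1) ENVIRON["ZROK_DB_PASSWORD"]
        $0 = substr($0, i + length(tok))
      }
      print out $0
    }'
  fi
}

# In a pod this value would come from a Secret via env; example value only.
export ZROK_DB_PASSWORD='example-only'

rendered=$(printf '%s\n' "$template" | render_config)
printf '%s\n' "$rendered"
```

In a real deployment the rendered output would be written to a volume shared with the zrok container rather than printed, so the password never appears in the rendered Helm manifests.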
Correct, zrok doesn't support env vars in its configs yet. Here's a couple of GitHub issues tracking improved config handling, including env vars: I used Does this accurately summarize the password issue with ArgoCD? ArgoCD consumes the and applies the manifests generated by the |
I couldn't think of a way to refactor the charts to be compatible with a GitOps workflow without adding manual steps to the main Helm-driven workflow, which involves calling the Kube API to manage existing resources and trigger life cycle hooks. I'm not giving up on GitOps by any means. In the meantime, maybe you could insert a Kustomize step in your GitOps workflow like this:
|
Hey, Happy New Year! Hope you had good holidays :) I started working on some changes that are working but probably need more work; I got sick before the holidays and basically stopped there. I need to run some additional tests, but so far secrets are working with envsubst. There is a breaking change that I added to support existingSecret for jwtToken, so the resource definition is uniform with the other Helm charts, but that can be moved to a separate variable to avoid the breaking change. I will open a PR, and then let me know what you think. I'm afraid I would have to refactor our whole ArgoCD repo to add kustomize steps for some ad hoc cases. So am I understanding correctly that moving the pre-delete hook to a post-delete hook would not work properly? |
The purpose of that pre-delete hook is to delete the public frontend's identity secret from the cluster. When driven by Helm, it needs to run before the service account is deleted because that's how it gets permission to manage secrets in the release namespace. |
Since you're generating templates from the chart, you could delete that hook entirely if you're managing the life cycle of the identity secret another way. |
Yeah, we are using the external secrets operator to manage secrets to avoid putting them in plaintext in git, but all it does is create a secret in k8s from GCP Secret Manager. The rest is then handled by the zrok scripts during the bootstrap process (to add identities, etc). Maybe it's not a bad idea to add something like |
@pavars as @qrkourier has recognized, I have already opened a PR that also concerns the Helm pre-upgrade hook. My suggested change makes it possible to omit this hook by setting a value in the respective values file. I assume this mechanism should also help in your case. |
Hi,
I'm trying to deploy self-hosted zrok with OpenZiti. The idea and the product seem nice, but there is a clear lack of documentation for a properly secured and working configuration, and it seems like it is still in the PoC stage. First of all, the Helm charts don't support adding extraEnv variables from Secret mounts (we are using the external-secrets operator, which pulls secret data from GCP Secret Manager so we don't expose keys and secrets in plaintext manifests). This means that the enrollmentJWT and the Ziti admin secret need to be passed into the Helm chart as plaintext values.
Secondly, the Helm hooks that create users, etc. sometimes misbehave when deploying with ArgoCD. I have hit so many configuration issues that I keep redeploying zrok/Ziti; the users get created as part of the bootstrap process each time, and this leads to drift between the secrets and the Ziti controller. The biggest issue is that secrets get regenerated, and what is written in the Ziti controller database doesn't match what is in the K8S secrets, so the initial login doesn't work. I see that there is support for a postgres database for zrok, so it can be scaled horizontally and still retain the same data. However, the config part responsible for the data store doesn't provide any flexibility to make the required changes; it is hardcoded to use sqlite3. It is also unclear whether ziti-router is needed or whether it is enough to expose the ziti-controller edge API as a LoadBalancer service (the docs say one thing, but after testing it my conclusion is that ziti-router is required).
I tried mounting the enrollmentJWT as an additionalVolume and setting .Values.enrollJwtFile to the mounted volume, but that fails. I can see and read the mounted token on the pod filesystem, but for some reason the zrok controller fails; falling back to setting the same token explicitly in enrollmentJwt works fine. I might be wrong, but it feels like the enrollmentJwt for ziti-router could also be bootstrapped from a script, so there would be no need to manually log in to the Ziti controller and create the router.
Another problem I ran into was creating new identity configs. When zrok initially starts, it tries to bootstrap and create the required identity. I ran into the issue that the identity "public" already exists, so I had to manually drop the identity and create it again. Additionally, the new identity was created with ID -D3xLHGw2, and when the zrok frontend tried to start it failed because it didn't recognise the configuration flag "-D3xLHGw2" passed on the CLI. This needs proper escaping, as it seems to be an edge case. In a DR scenario where these resources would be recreated, all persistent data would be lost and all clients would have to reauthenticate with new tokens/passwords; correct me if I'm wrong. The initial config might be good enough to serve zrok/Ziti locally, but it is far from production-ready, or even good enough to serve dev resources on a GKE cluster.
Below is our current config for the Helm chart. However, we will probably have to keep our own version of these Helm charts, since they seem to be lacking vital configuration options. My only concern is with maintaining the scripts that are called for bootstrap, etc. I could open a PR for the Helm charts to properly support mounting envFrom: secrets/configmaps if existingSecret is defined, and also an option to configure the zrok ctrl.yaml with a postgres DB.