/readyz endpoint returns 200 OK when not all enabled services are running #43440

Open
programmerq opened this issue Jun 24, 2024 · 1 comment
Labels
feature-request Used for new features in Teleport, improvements to current should be #enhancements

Comments

@programmerq
Contributor

What would you like Teleport to do?

Introduce a new health-check endpoint (or modify the existing /readyz endpoint) that provides a 200 OK response only if all enabled services in the configuration are up and running without errors.
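To make the request concrete, here is a minimal sketch of the behavior in Go. It is not Teleport's implementation: `serviceStates`, `notReady`, `strictReadyz`, and `probe` are hypothetical names invented for illustration, and the choice of 503 for the not-ready case is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sort"
	"strings"
)

// serviceStates maps an enabled service name (e.g. "ssh_service") to whether
// it is currently healthy. Hypothetical type, not Teleport's internal state.
type serviceStates map[string]bool

// notReady returns the sorted names of enabled services that are not healthy.
func notReady(s serviceStates) []string {
	var names []string
	for name, healthy := range s {
		if !healthy {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

// strictReadyz builds a handler that answers 200 only when every enabled
// service reported by get is healthy, and 503 (an assumed status) otherwise.
func strictReadyz(get func() serviceStates) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if bad := notReady(get()); len(bad) > 0 {
			http.Error(w, "not ready: "+strings.Join(bad, ", "), http.StatusServiceUnavailable)
			return
		}
		fmt.Fprintln(w, "ok")
	}
}

// probe runs one in-memory request against the handler and returns the
// HTTP status code it produced.
func probe(s serviceStates) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/readyz", nil)
	strictReadyz(func() serviceStates { return s })(rec, req)
	return rec.Code
}

func main() {
	// One enabled service is unhealthy, so the endpoint is not ready.
	fmt.Println(probe(serviceStates{"app_service": true, "ssh_service": false})) // 503
	// All enabled services are healthy.
	fmt.Println(probe(serviceStates{"app_service": true, "ssh_service": true})) // 200
}
```

Listing the unhealthy service names in the response body is a design choice of this sketch, so that a failing probe also says which service is holding readiness back.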

What problem does this solve?

Currently, the /readyz endpoint returns a 200 OK status as soon as the instance successfully heartbeats with the cluster.

This means that if one or more of the configured Teleport services (e.g., app_service) is not yet ready after the heartbeat, or never starts up properly, /readyz still returns a 200 OK. This holds as long as the instance was able to complete a heartbeat of any kind.

A repeatable way to force a successful heartbeat while leaving a service broken is to enable both the ssh_service and the app_service, and then try to join the cluster with a token that is valid for the app role only. The app service starts up and the instance heartbeats, but the ssh_service never becomes healthy, all while /readyz returns 200 OK.
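The gap in this reproduction can be modeled in a few lines of Go. This is a simplified sketch, not Teleport code: the `instance` type and the functions `currentReady`, `requestedReady`, and `appOnlyTokenRepro` are hypothetical, and requiring a successful heartbeat in the requested policy is an assumption.

```go
package main

import "fmt"

// instance is a simplified, hypothetical model of a Teleport process.
type instance struct {
	heartbeatOK bool            // the instance has heartbeated with the cluster
	services    map[string]bool // enabled service name -> healthy
}

// currentReady models the behavior described in this issue: the instance
// counts as ready as soon as any heartbeat has succeeded.
func currentReady(i instance) bool {
	return i.heartbeatOK
}

// requestedReady models the requested behavior: ready only when the
// heartbeat succeeded (assumed) and every enabled service is healthy.
func requestedReady(i instance) bool {
	if !i.heartbeatOK {
		return false
	}
	for _, healthy := range i.services {
		if !healthy {
			return false
		}
	}
	return true
}

// appOnlyTokenRepro returns the state reached when both services are enabled
// but the join token is valid for the app role only.
func appOnlyTokenRepro() instance {
	return instance{
		heartbeatOK: true,
		services: map[string]bool{
			"app_service": true,
			"ssh_service": false,
		},
	}
}

func main() {
	i := appOnlyTokenRepro()
	fmt.Println("current /readyz ready:  ", currentReady(i))   // true
	fmt.Println("requested /readyz ready:", requestedReady(i)) // false
}
```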

If a workaround exists, please include it.

I looked over the /metrics endpoint, hoping that health/status info for each service might be there, but it wasn't. There doesn't appear to be a good way to determine the readiness based on the status of the individual Teleport services.

/healthz will always return a 200 if the process is running. If it is determined that the current behavior of /readyz should not be altered, an additional endpoint with the desired behavior would be great.

@programmerq programmerq added the feature-request Used for new features in Teleport, improvements to current should be #enhancements label Jun 24, 2024
@zmb3
Collaborator

zmb3 commented Jun 27, 2024

Looks like this may be a duplicate of #11065.
