Add health probes to API server #485

Merged 1 commit on Jun 6, 2023

Conversation

Contributor

@sayan-biswas sayan-biswas commented May 30, 2023

Changes

Fixes #280
Fixes #414

This PR adds gRPC and REST health probe endpoints to the API server and configures liveness, readiness, and startup probes for the API server deployment. It also adds retries when connecting to the database, to prevent unnecessary pod restarts.

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you review them:

  • Has Docs included if any changes are user facing
  • Has Tests included if any functionality added or changed
  • Tested your changes locally (if this is a code change)
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including functionality, content, code)
  • Has a kind label. You can add a comment on this PR that contains /kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
  • Release notes block below has been updated with any user-facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings)
  • Release notes contain the string "action required" if the change requires additional action from users switching to the new release

Release Notes

- api: Add healthz endpoint to enable liveness, readiness, and startup probes for the api-server deployment. Default configuration uses HTTP, but gRPC can also be used.
- api: Poll up to two minutes when establishing the database connection.

@tekton-robot

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@tekton-robot tekton-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesnt merit a release note. labels May 30, 2023
@tekton-robot tekton-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 30, 2023
@sayan-biswas
Contributor Author

/kind feature

@tekton-robot tekton-robot added the kind/feature Categorizes issue or PR as related to a new feature. label May 30, 2023
@tekton-robot tekton-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesnt merit a release note. labels May 30, 2023
@sayan-biswas sayan-biswas marked this pull request as ready for review May 30, 2023 18:42
@tekton-robot tekton-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 30, 2023
@sayan-biswas
Contributor Author

/ok-to-test

@tekton-robot tekton-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label May 30, 2023
This adds a health endpoint, supported over both gRPC and REST, to monitor server health as well as individual service health. It also adds retries for connecting to the database.
Contributor

@adambkaplan adambkaplan left a comment

Generally looks good, just one question about using HTTPS for health checks.

httpGet:
  path: /healthz
  port: 8080
  scheme: HTTPS

Does using HTTPS present an issue if the apiserver uses a certificate that is not globally trusted? This is the most likely deployment scenario, where the apiserver uses a cluster-signed/self-signed certificate, and the ingress re-encrypts the traffic.

Contributor Author

@sayan-biswas sayan-biswas Jun 5, 2023

@adambkaplan It doesn't create an issue, because the kubelet skips certificate verification when it sends an HTTPS probe request.

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#http-probes

@adambkaplan
Contributor

/approve

Accepting as a feature.

@tekton-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adambkaplan

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 5, 2023
@adambkaplan
Contributor

Proposing updated release note:

- api: Add healthz endpoint to enable liveness, readiness, and startup probes for the apiserver deployment.
- api: Poll up to two minutes when establishing the database connection.

@enarha
Contributor

enarha commented Jun 6, 2023

The change LGTM, but let's give other people a chance to review.

@alan-ghelardi
Contributor

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 6, 2023
@enarha
Contributor

enarha commented Jun 6, 2023

/test pull-tekton-results-integration-tests

@tekton-robot tekton-robot merged commit 59b01e5 into tektoncd:main Jun 6, 2023
Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lgtm: Indicates that a PR is ready to be merged.
  • ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

  • Set up health checking provided the service name
  • Liveness and Readiness Probes for Results
5 participants