Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the ability to add healthcheck definition for containers #1742

Closed
wants to merge 4 commits into from

Conversation

svagner
Copy link

@svagner svagner commented Mar 1, 2021

Original issue: #770

Adds follow options for container_image:

  • healthcheck_test
  • healthcheck_interval
  • healthcheck_timeout
  • healthcheck_start_period
  • healthcheck_retries

Original issue: bazelbuild#770

Adds follow options for container_image:
- healthcheck_test
- healthcheck_interval
- healthcheck_timeout
- healthcheck_start_period
- healthcheck_retries
@google-cla
Copy link

google-cla bot commented Mar 1, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: svagner
To complete the pull request process, please assign davegay after the PR has been reviewed.
You can assign the PR to them by writing /assign @davegay in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@svagner
Copy link
Author

svagner commented Mar 1, 2021

@googlebot I signed it!

@google-cla google-cla bot added cla: yes and removed cla: no labels Mar 1, 2021
@svagner
Copy link
Author

svagner commented Mar 1, 2021

cc @xingao267 , @alex1545 as you were in the discussion in the original issue

@zoidyzoidzoid
Copy link
Contributor

Hiding whitespace changes definitely makes this diff easier to review: https://github.com/bazelbuild/rules_docker/pull/1742/files?w=1

README.md Outdated
The test to perform to check that the container is healthy.</a></p>
<p>An empty list means to inherit the default.</p>
<p>Possible definition options are:</p>
<p><code>[]</code> : inherit healthcheck</p>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None also inherits the healthcheck. It may be nice to specify this for callers who write their own macros wrapping this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added. Thanks!

@gravypod
Copy link
Collaborator

I don't see a reason why this couldn't be merged. The only concern I have is that we're overloading healthcheck_test to do multiple types of tests (CMD, CMD-SHELL, NONE, etc). Ideally I'd gravitate towards separating these out but I'm not sure that would make sense in this case due to this being a docker-specific rule.

I'm leaning towards thinking this is fine to include but I'd like the opinion of another maintainer just as a sanity check.

@pcj
Copy link
Member

pcj commented Mar 14, 2021

I think it's cool that you put this together, but I'm a little wary of this as-is for a couple of reasons:

  1. It's a bit telling that I've never heard of this feature before today. As far as I can tell, it's not supported in kubernetes but was a feature added on by Docker Inc for swarm specifically. I don't think this feature is going to get much mileage, and it comes at a time when we're trying to simplify this repo rather than add complexity.
  2. Given a Dockerfile only has the HEALTHCHECK command, why does rules_docker now need 5 new options to say the same thing? If there is no parsing library in google/go-containerregistry (upon which this repo depends) to parse that into the necessary config, then that library/function should probably there upstream before merging this. I don't think we should need more than a single new string or string_list attribute healthcheck for this.

@svagner
Copy link
Author

svagner commented Mar 18, 2021

I don't see a reason why this couldn't be merged. The only concern I have is that we're overloading healthcheck_test to do multiple types of tests (CMD, CMD-SHELL, NONE, etc). Ideally I'd gravitate towards separating these out but I'm not sure that would make sense in this case due to this being a docker-specific rule.

I'm leaning towards thinking this is fine to include but I'd like the opinion of another maintainer just as a sanity check.

Yeah. I also thought that probably it's better to have healthcheck_cmd and healthcheck_cmd_shell options. But it will lead to more complications:(

@svagner
Copy link
Author

svagner commented Mar 18, 2021

I think it's cool that you put this together, but I'm a little wary of this as-is for a couple of reasons:

  1. It's a bit telling that I've never heard of this feature before today. As far as I can tell, it's not supported in kubernetes but was a feature added on by Docker Inc for swarm specifically. I don't think this feature is going to get much mileage, and it comes at a time when we're trying to simplify this repo rather than add complexity.
  2. Given a Dockerfile only has the HEALTHCHECK command, why does rules_docker now need 5 new options to say the same thing? If there is no parsing library in google/go-containerregistry (upon which this repo depends) to parse that into the necessary config, then that library/function should probably there upstream before merging this. I don't think we should need more than a single new string or string_list attribute healthcheck for this.

Thanks for the feedback. Let me explain my thoughts about it.

  1. These rules are called rules_docker so I believe it means that we should support at least all the main features of docker. I agree that kube doesn't use docker's HEALTHCHECK since it has it's own livenessProbe but this plugin isn't dedicated to kube only is it?
  2. I think having healthcheck_test =" --interval = 5s --timeout = 10s --retries = 3 CMD curl -sS 127.0.0.1 "" will be a little bit confusing for users of these rules since interval / to / retries are optional parameters for health checking right? And unfortunately I don't follow your point about the parsing of this parameter in the google / go-containerregistry`. Since it is a single directive there is not necessary to parse it. But I'm again worrying about moving complications of the definition for health checks from the rules code to users.

Let me know if my arguments make sense to you.

@pcj
Copy link
Member

pcj commented Mar 19, 2021

I don't see how --interval = 5s --timeout = 10s --retries = 3 CMD curl -sS 127.0.0.1 could possibly be confusing to any user wanting to use this, because that's the syntax Docker came up with and has clearly documented.

And it is definitely necessary to parse it, because the configuration for it is not a simple string. It is https://github.com/moby/moby/blob/46cdcd206c56172b95ba5c77b827a722dab426c5/api/types/container/config.go#L16-L35:

// HealthConfig holds configuration settings for the HEALTHCHECK feature.
type HealthConfig struct {
	// Test is the test to perform to check that the container is healthy.
	// An empty slice means to inherit the default.
	// The options are:
	// {} : inherit healthcheck
	// {"NONE"} : disable healthcheck
	// {"CMD", args...} : exec arguments directly
	// {"CMD-SHELL", command} : run command with system's default shell
	Test []string `json:",omitempty"`

	// Zero means to inherit. Durations are expressed as integer nanoseconds.
	Interval    time.Duration `json:",omitempty"` // Interval is the time to wait between checks.
	Timeout     time.Duration `json:",omitempty"` // Timeout is the time to wait before considering the check to have hung.
	StartPeriod time.Duration `json:",omitempty"` // The start period for the container to initialize before the retries starts to count down.

	// Retries is the number of consecutive failures needed to consider a container as unhealthy.
	// Zero means inherit.
	Retries int `json:",omitempty"`
}

Part of the parser is here: https://github.com/moby/buildkit/blob/master/frontend/dockerfile/parser/line_parsers.go#L343-L369

But rules_docker has no dependency on moby library, it has a dependency on go-containerregistry, which has the same copy of the config: https://github.com/moby/buildkit/blob/master/frontend/dockerfile/parser/line_parsers.go#L343-L369.

Yes, as far as I know, there is no parsing logic in go-containerregistry to transform a string into that config. The parsing logic should be upstreamed there, in the same place that has the config. Once that logic is there, we could call parseHealthcheck here and get the correct config.

Also, rules_docker should not strive to replicate every feature in docker or a Dockerfile. A noteable absence is the RUN feature.

@svagner
Copy link
Author

svagner commented Mar 19, 2021

Ah, you meant about an unmarshaling line to the struct. Yeah. Makes sense. I'm not sure go-containerregistry will approve this change though since it looks like they don't need such a directive there. I can try to prepare PR for go-containerregistry for it but are we ready to add a health check directive here? I really just don't want to waste reviewers of go-containerregistry time and my time if we don't want this feature at all.

Also, rules_docker should not strive to replicate every feature in docker or a Dockerfile. A noteable absence is the RUN feature.

@pcj
Copy link
Member

pcj commented Mar 29, 2021

If we can get a parsing function upstreamed and simplify this PR to add only a single healthcheck attribute, I would support it.

@medzin
Copy link

medzin commented Aug 30, 2021

@svagner will you continue this PR?

@alexeagle
Copy link
Collaborator

Hey @jonjohnsonjr what do you think about the proposed upstream change to go-containerregistry?

@jonjohnsonjr
Copy link
Contributor

what do you think about the proposed upstream change to go-containerregistry?

That sounds reasonable to me, something like func ParseHealthConfig(s string) (*HealthConfig, error)?

I'm happy to review a PR.

@dgocoder
Copy link

dgocoder commented Jan 26, 2022

Any progress on this? Would be great to get this added!! If you want to use ecs with out elb/alb for internal services with service discovery the way they support health checks is through docker health check. Without this it becomes an issue.

@alexeagle
Copy link
Collaborator

@dgocoder I'm not sure if @svagner is still around to finish this work (by upstreaming the Go parts) or if it needs a new contributor to step up to finish.

@pjjw
Copy link

pjjw commented Jul 19, 2022

hi, i'm going to take a crack at picking this up and upstreaming said changes. posting to shame myself into followthrough.

@github-actions
Copy link

This Pull Request has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_docker!

@github-actions github-actions bot added the Can Close? Will close in 30 days unless there is a comment indicating why not label Jan 17, 2023
@nadavwe
Copy link

nadavwe commented Jan 26, 2023

hey @pjjw, any luck?
We would really love to see this merged.
currently we need to have a base image created in a different build to include the healthcheck, quite far from the application definition... :/

@pjjw
Copy link

pjjw commented Jan 26, 2023

no, have not plumbed this upstream yet, but this diff works for us as-is.

@github-actions github-actions bot removed the Can Close? Will close in 30 days unless there is a comment indicating why not label Jan 27, 2023
@github-actions
Copy link

This Pull Request has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_docker!

@github-actions github-actions bot added the Can Close? Will close in 30 days unless there is a comment indicating why not label Jul 27, 2023
@github-actions
Copy link

This PR was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"

@github-actions github-actions bot closed this Aug 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Can Close? Will close in 30 days unless there is a comment indicating why not cla: yes size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.