Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 1583 #1584

Merged
merged 134 commits into from
Mar 8, 2024
Merged

Fix 1583 #1584

merged 134 commits into from
Mar 8, 2024

Conversation

wind57
Copy link
Contributor

@wind57 wind57 commented Feb 22, 2024

No description provided.

wind57 and others added 30 commits December 4, 2021 07:59
@@ -1,2 +1,11 @@
server:
port: 8761

# needed to disable the publication of InstanceRegisteredEvent
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

disable the InstanceRegisteredEvent in case of discovery-server

@@ -33,7 +33,8 @@ public class KubernetesClientProfileEnvironmentPostProcessor extends AbstractKub
@Override
protected boolean isInsideKubernetes(Environment environment) {
CoreV1Api api = new CoreV1Api();
KubernetesClientPodUtils utils = new KubernetesClientPodUtils(api, environment.getProperty(NAMESPACE_PROPERTY));
KubernetesClientPodUtils utils = new KubernetesClientPodUtils(api, environment.getProperty(NAMESPACE_PROPERTY),
false);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

false in here as the argument to fail-fast, to preserver the previous logic.

@@ -58,6 +58,9 @@ public class KubernetesClientPodUtils implements PodUtils<V1Pod> {

private final String serviceHost;

private final boolean failFast;

@Deprecated(forRemoval = true)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deprecate this constructor

this.failFast = false;
}

// mainly needed for the health and info contributors, so that they report DOWN
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

introduce a new constructor that takes a failFast as argument. For the health and info stuff, use failFast=true. For the environment post processor use failFast=false, in order to preserve compatibility

enabled: false


management:
Copy link
Contributor Author

@wind57 wind57 Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without this, curl .../actuator/health will show DOWN; but curl .../actuator/health/liveness and curl .../actuator/health/readiness will show UP

With this addition, all will show DOWN which I assume is what we want (at least that would be my expectation)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those should automatically be enabled when running on the platform...maybe I am misunderstanding

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean that KubernetesClientHealthIndicator and KubernetesClientInfoContributor should already be part of the liveness and readiness when running inside kubernetes, and they should not be added to:

management:
  endpoint:
    health:
      group:
        liveness:
          include: livenessState, kubernetes
        readiness:
          include: readinessState, kubernetes

right? If so, then its not how it works :) They need to be explicitly added, I admit I was not aware of this neither until I started working on this...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to dig into this some more, because this doesn't seem right

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -14,3 +14,11 @@ spec:
name: spring-cloud-kubernetes-k8s-client-discovery-server
port:
number: 8080

- path: /
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've merged the two ingress files we had in this test into a single one, because I want to assert for kubernetes health indicator being present also

@@ -35,7 +35,11 @@ class DiscoveryServerApplicationContextTests {

@Nested
@SpringBootTest(classes = TestConfig.class,
properties = "spring.cloud.kubernetes.http.discovery.catalog.watcher.enabled=true")
properties = { "spring.cloud.kubernetes.http.discovery.catalog.watcher.enabled=true",
"management.health.livenessstate.enabled=true",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding these 4 arguments requires a bit of explanation. I have added management.endpoint.health.group.liveness.include: livenessState, kubernetes. This means that besides the usual livenessState, we will also use KubernetesClientHealthIndicator to compute the liveness of discovery server.

This change means that when tests run, KubernetesClientHealthIndicator will try to be picked up also. Since it has a conditional in the form of : @ConditionalOnCloudPlatform(CloudPlatform.KUBERNETES), this means that enabling it (so that the test does not fail), would mean to enable that particular conditional. But if I do enabled it, it will also start the kubernetes informer discovery client, which will try to connect to a local k8s cluster and things get a lot more complicated for testing as such.

Instead, I decided to override the management.endpoint.health.group.liveness.include: livenessState, kubernetes, so that kubernetes is excluded for the test, thus these 4 annotations.

I've slightly changed an integration test to still be able to assert that this health indicator is present.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could a comment to this effect be added?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added, thank you.

@wind57 wind57 marked this pull request as ready for review February 25, 2024 19:44
@wind57
Copy link
Contributor Author

wind57 commented Feb 25, 2024

@ryanjbaxter this is ready, but its a bit involved and will require a bit of your eyes, specifically around the fact that I am changing the way liveness and readiness for discovery server work, but then again, the defect is about that. let me know if any question, thank you

@ryanjbaxter
Copy link
Contributor

I’m on vacation this week, I will look when I get back

@wind57
Copy link
Contributor Author

wind57 commented Mar 7, 2024

Just in case this went under the radar... Ping

@wind57 wind57 requested a review from ryanjbaxter March 8, 2024 20:11
@ryanjbaxter ryanjbaxter merged commit 70bc54e into spring-cloud:main Mar 8, 2024
14 checks passed
@codefromthecrypt
Copy link
Contributor

can you put a summary description? I think I can guess what the outcome is based on the changes and the notes on the individual changes. Is the summary that node permissions are still required, but you can skip failing on them? If you skip failing which health checks (if any) still pass? (e.g. /health, liveness readiness, etc)

@wind57
Copy link
Contributor Author

wind57 commented Mar 8, 2024

I read your note a couple of times, but I still dont understand what you mean :( I really want to answer, but I really fail to understand the question. Can you may be re-phrase it differently? Thank you upfront

@codefromthecrypt
Copy link
Contributor

no problem:

Can you describe the high level impact of this change?

#1583 says "discoveryserver: readiness probe shouldn't pass when permissions are incorrect"

  • So, now the readiness probe will no longer pass?
  • Does this also affect /health or liveness probe?
  • Are there any other changes that would affect deployment one should know about?
  • Is anything deprecated as a result of this change?

@wind57
Copy link
Contributor Author

wind57 commented Mar 9, 2024

So, now the readiness probe will no longer pass?

correct

Does this also affect /health or liveness probe?

correct, both.

Are there any other changes that would affect deployment one should know about?

nope.

Is anything deprecated as a result of this change?

nope.

@codefromthecrypt
Copy link
Contributor

Thanks, so basically now, all probes and /health will not pass when node permissions are absent. Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

discoveryserver: readiness probe shouldn't pass when permissions are incorrect
4 participants