fix(zbchaos): retry SaaS detection and panic if it fails instead of falling back to Self-Managed mode #490

lenaschoenburg · 2024-02-06T13:13:24Z

I added a dependency on a generic exponential backoff library rather than writing the retry logic myself. If querying for the CR still fails after retries are exhausted, we panic instead of falling back to Self-Managed mode. I think this is preferable to retrying forever.
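For readers who want to see the shape of the retry behavior described above, here is a minimal standard-library sketch of bounded exponential backoff. It is not the library this PR adds. `retryWithBackoff`, `errUnavailable`, and the `queryCR` stand-in are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnavailable is a hypothetical error standing in for a failed CR query.
var errUnavailable = errors.New("cluster unavailable")

// retryWithBackoff calls op until it succeeds or the next wait would exceed
// maxElapsed. The wait doubles after every failure, capped at maxInterval.
func retryWithBackoff(op func() error, initial, maxInterval, maxElapsed time.Duration) error {
	start := time.Now()
	wait := initial
	for attempt := 1; ; attempt++ {
		err := op()
		if err == nil {
			return nil
		}
		// Give up instead of retrying forever once the time budget is spent.
		if time.Since(start)+wait > maxElapsed {
			return fmt.Errorf("giving up after %d attempts: %w", attempt, err)
		}
		time.Sleep(wait)
		wait *= 2
		if wait > maxInterval {
			wait = maxInterval
		}
	}
}

func main() {
	calls := 0
	// Hypothetical stand-in for the CR query: fails twice, then succeeds.
	queryCR := func() error {
		calls++
		if calls < 3 {
			return errUnavailable
		}
		return nil
	}
	err := retryWithBackoff(queryCR, 10*time.Millisecond, 100*time.Millisecond, time.Second)
	fmt.Println("calls:", calls, "err:", err)
}
```

The sketch returns the last error to the caller once the time budget is exhausted, so what happens next (panic, exit, or fallback) is decided by the caller rather than by the retry helper.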

Closes #488

lenaschoenburg · 2024-02-06T14:29:09Z

Unfortunately this breaks tests that depend on the fallback behavior. For example, Test_CreateK8ClientWithPath creates a client that tries to connect to https://127.0.0.1:46807. When building the client, we check whether it's SaaS or SM. Since the URL is not reachable, we can't query the CR and thus can't determine whether it's SaaS.

Seems to me that we can't fix #488 without breaking a lot of tests 😅

ChrisKujawa · 2024-02-06T14:34:37Z

go-chaos/internal/saas.go

-		LogInfo("Failed to retrieve SaaS CRD, fallback to self-managed mode. %v", err)
-		return false
+		LogError("Failed to check for SaaS CRD, can't proceed without knowing if this is SaaS or Self-Managed: %v", err)
+		panic(err)


Let's not panic here :) If you want we can take a look at it together tomorrow.

lenaschoenburg · 2024-02-06

But then what? We don't know whether it's SaaS or not.

ChrisKujawa · 2024-02-07

The point I wanted to make is the following:

I don't want to have panics in the internal package; you can see it as the lib or backend. It would be the same as a Java library that just calls exit(1): unexpected and not nice. For me, it would be OK to have this in the commands (they are the actual users of the libs, the user-facing code, the last layer).

I would like us to return an error and let the consumer of the method handle it.

I think we have three cases to distinguish here: a) CRD exists: SaaS, b) CRD does not exist: SM, c) error: return the error. If a missing CRD currently also causes an error (we have to check this, I guess), then we need to distinguish these error cases somehow.
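The three cases could be sketched roughly as below. This is an illustration under assumptions, not the code in go-chaos/internal/saas.go: `isSaaS`, `errNotFound`, and `errUnreachable` are hypothetical names, and the lookup is passed in as a function so the example needs no cluster. With the Kubernetes client libraries, the "does not exist" check would typically go through `IsNotFound` from `k8s.io/apimachinery/pkg/api/errors`, but whether a missing CRD actually surfaces that way here is the open question raised in the comment.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors used only in this sketch.
var (
	errNotFound    = errors.New("CRD not found")
	errUnreachable = errors.New("connection refused")
)

// isSaaS maps the result of a CRD lookup onto the three cases:
// found, not found, and "could not determine".
func isSaaS(lookupCRD func() error) (bool, error) {
	err := lookupCRD()
	switch {
	case err == nil:
		return true, nil // a) CRD exists: SaaS
	case errors.Is(err, errNotFound):
		return false, nil // b) CRD does not exist: Self-Managed
	default:
		// c) any other failure: report it instead of guessing the mode
		return false, fmt.Errorf("checking for SaaS CRD: %w", err)
	}
}

func main() {
	lookups := []func() error{
		func() error { return nil },
		func() error { return errNotFound },
		func() error { return errUnreachable },
	}
	for _, lookup := range lookups {
		saas, err := isSaaS(lookup)
		fmt.Printf("saas=%v err=%v\n", saas, err)
	}
}
```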

lenaschoenburg · 2024-02-07

Okay, that part makes sense to me, but I'm not sure how to handle the errors. We check for SaaS when building the k8sclient, so very early on for every request. What can we do if that fails?
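One possible shape for the layering suggested in this thread: the client constructor returns the detection error, and only the command layer decides to stop. This is a sketch under assumptions and not necessarily what was merged. `K8Client`, `newK8Client`, `run`, and `errDetect` are hypothetical stand-ins, not the identifiers used in zbchaos.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errDetect is a hypothetical detection failure used in this sketch.
var errDetect = errors.New("connection refused")

// K8Client is a hypothetical stand-in for the client built in the internal package.
type K8Client struct {
	SaaS bool
}

// newK8Client returns an error instead of panicking when detection fails.
func newK8Client(detect func() (bool, error)) (*K8Client, error) {
	saas, err := detect()
	if err != nil {
		return nil, fmt.Errorf("creating k8s client: %w", err)
	}
	return &K8Client{SaaS: saas}, nil
}

// run plays the role of a command, the last user-facing layer. It reports the
// error and returns a non-zero exit code; nothing below it exits the process.
func run(detect func() (bool, error)) int {
	client, err := newK8Client(detect)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return 1
	}
	fmt.Println("SaaS:", client.SaaS)
	return 0
}

func main() {
	working := func() (bool, error) { return true, nil }
	failing := func() (bool, error) { return false, errDetect }
	fmt.Println("exit code:", run(working))
	fmt.Println("exit code:", run(failing))
}
```

Passing the detection step in as a function is a choice made for this sketch. It would also let a test such as Test_CreateK8ClientWithPath supply a stub instead of needing a reachable cluster, though the thread does not say whether that approach was taken.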

lenaschoenburg added 3 commits February 6, 2024 14:05

feat(zbchaos): add method for error logging

4735f0c

deps(zbchaos): add a dependency for exponential backoff

11789db

fix(zbchaos): retry detection of SaaS and panic on error

f8fca7d

lenaschoenburg requested a review from ChrisKujawa as a code owner February 6, 2024 13:13

lenaschoenburg mentioned this pull request Feb 6, 2024

Detection of SaaS vs SM environment incorrectly fallback to SM #488

Closed

ChrisKujawa reviewed Feb 6, 2024

fix(zbchaos): limit backoff to 1 minute instead of 15

1512500

ChrisKujawa mentioned this pull request Feb 9, 2024

Saas detection error handling #493

Merged

ChrisKujawa closed this in #493 Feb 9, 2024

ChrisKujawa closed this in cc9c847 Feb 9, 2024

ChrisKujawa Feb 6, 2024

lenaschoenburg Feb 6, 2024

ChrisKujawa Feb 7, 2024 (edited)

lenaschoenburg Feb 7, 2024
